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EfBcient Indexmg Of Hierarchical Relational Database Records 

Inventor : 
Mark A. Gaponoff 

CROSS REFERENCE TO RELATED APPLICATIONS 

[CWOl] This application claims the benefit of U.S. Provisional Application No. 
60/544,015, entitled "Efficient Indexing Of Hierarchical Relational database Entries", filed 
' February 11, 2004, vMch is incorporated by reference herein, 

BACKGROUND 

Field of the Invention 

[0002] This invention relates in general to storing hierarchical data structures in 
relational databases, and more particularly, to efficient indexing of hierarchical data 
structures in relational databases. 
Background Art 

[0003] Hierarchical data structures, such tree data structures, are commonly used to 
represent data. Tree data structures are made of nodes, which can contain both data to be 
stored and pointers to other nodes in the tree. The terms "hierarchical data structure", "tree 
data structure" and **tree" are used interchangeably throiighout this disclosure. Nodes in a 
tree structure are in an ancestor-child relationship. A root node is the topmost node that has 
no ancestor nodes. A tree structure also has leaf nodes and internal nodes. An internal node 
is a node that has one or more child nodes. A leaf node has no children. All children of the 
same node are siblings. Referring now Fig. 1, an exemplary tree data structure 160 is shown. 
In Fig. 1, node "Alpha" is a root node that has the following three children: 'Bravo", 
"Charlie", and "Julief Since Bravo, Charlie, and Juliet are children of the same parent. 
Alpha, these nodes are siblings (or 'peers'). In the tree structure shown in Fig. 1, each node 
is identified by its unique ID (OID), the unique ID of its parent (Parent_OID), and its 
sequential number among siblings (Peer Sequence). 

[0004] It has become commonplace to represent hierarchical data structures in 
relational databases. An example of a relational database representation of the tree structure 
illustrated in Fig. 1 is shown below in Table 1 . 

1 
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Table 1 : Database Representation of Tree Data Structure 1 60 
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[0005] Table 1 stores a plurality of records. Each record is associated with a node in 
the tree data structure and includes a plurality of fields. An exemplary record shown in Table 
1 includes a DATA field that stores data associated with a particular ndde in the data 
structure, and OID, Parent_OID, Peer_Sequence, and Level fields. 

[0006] The OID field stores a unique object ID of a given node. An object ID can be 
any value that uniquely identifies the node. In a preferred embodiment, ODD is a numeric 
value. 

[0007] The Parent_OID field stores a pointer or reference to the parent of the node. 

When the node is a root node, it has no parent In one implementation, the value of 

Parent OID for the root node is NULL or a null-like value, such as zero. 

[0008] The Peer Sequence field stores an index value identifying the sequential order 

of a node relative to its siblings. Peer Sequence can be any value capable of being 

transformed into a constant-width string. In a preferred embodiment, Peer_Sequence can be 

any numeric value. The Peer_Sequence of the first node among siblings of the same parent 

can also indicate tiie interval by which Peer_Sequence of a subsequent sibling will be 

incremented. For example, if the numeric Peer_Sequence value of the first sibling is "10", 

Peer_Sequence for each subsequent sibling node is incremented by "10". 

[0009] The LEVEL field stores the sequential number of the level at which the node is 

located in the hierarchy. A hierarchical data structure can mclude any number of levels. The 

value of LEVEL is derived fix)m the position of the node in the hierarchy. 

[0010] The DATA field stores the value of the data associated with a given node. It 

could actually represent a collection of fields associated with each node. 

[0011] A database user ofl:en needs to examine relationships across multiple levels of 

the tree hierarchy or needs to identify the next node following a given node in the hierarchy. 
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One of the largest drawbacks of a relational database is its inability to handle queries 
efficiently. For example, to answer a question like "Is India a descendant of Charlie?" in the 
tree data structure shown in Fig. 1, one woxild have to perform a pointer chain traversal by 
iteratively traversing nodes in the tree data structure using a reference to the parent node. 
This process involves issuing several queries or joins within one query to find out if the 
PARENT_OID of India is the OID of CharUe, or if the PARENT_OID of the PARENT_OID 
of India is the OED of Charlie, etc. untQ the top of the hierarchy is reached. Below are 
sample SQL queries that need to be executed to perform the pointer-chain traversal of a data 
structure. 
Now = India 

While (up.PARENT_OID is not nuU) and (upJPARENT^OID o Charlie.OID) 

Do 

{ 

select up.PARENT_OID firom AJTREE now, A_TREE up where up.OID == 

now.PARENT_OID; 

up = now 

. } 

If up.PARENT_OID is null, India is NOT a descendent of Charlie, otherwise it is. 

[0012] This query is executed multiple tunes until a match is foimd or the root node in 
the tree structure is reached. To search more efficiently, one query that combines the 
previous queries can be issued. The problems with the above solution are that tihiey either 
require an iterative series of queries of unknown lengdi or a single complex and inefficient 
query. 

[0013] Often, it is desirable to represent a tree data structure in a tree display similar to 

a computer directory tree structure, showing the hierarchy of the nodes, such as the one 

shown below: 

Alpha 

+Bravo 

-H-Delta 

+Charlie 

-H-Echo 

-H-Foxtrot 

■H-hGolf 
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-i-H-f India 

-hK-Hotel 

+JuKet 

[0014] In this example, the *+' signs prejHxing the DATA name indicate that a given 
node is an immediate child of the nearest preceding node in the data structure with fewer (or 
no) signs. The existing solutions for ordering a hierarchical data structure have been left 
for programmers since this type of ordering caimot be accomphshed with a single SQL query. 
[0015] Accordingly, there is a need for a mechanism that provides for more efficient 
traversal of hierarchical data structures and avoids the complexities of prior art techniques. 

DISCLOSURE OF THE INVENTION 
[0016] The above need is met by a sorting construct (TREE__SORT), whicb is an index 
added to a relational database that stores hierarchical data structures. Adding the sorting 
construct provides for efficient use of sorting-type functions of the structured query language 
(SQL) to traverse among multiple levels of hierarchical data structures. 
[0017] According to one embodiment of the present invention, to generate the 
TElEE_SORT construct, a hierarchical data structure is traversed, starting from the root node. 
A TREE SORT value for a root node is generated by transforming a Peer_Sequence value of 
the root node to a left-justified string of a given width. Peer Sequence values identify a 
sequential order of nodes that have the same parent in a hierarchical data structure. 
Peer_Sequence values are monotonically increasing among the children of the same parent 
[0018] According to an embodiment of the present invention, TREE SORT values for 
other nodes in tree data structure are generated as follows: Peer Sequence values for children 
nodes are transformed to a left-justified substring of characters of a given width, and a 
TREE_SORT value of the parent node is prepended to the left of the generated substring of 
the transformed Peer__Sequence value. The resulting TREE SORT value for a given node, 
other than the root node, includes the following components: a leading substring that 
represents a TREE_SORT value of the parent node and the transformed Peer_Sequence value 
for that node. Thus, the TREE_SORT value represents the chain of Peer^Sequence values 
starting from the root node of tiie tree data structure and going down to the specific node. 
[0019] Importantiy, the generated TREE_SORT values are ordered according to the 
same collating sequence as Peer_Sequence values. A collating sequence is an ordering 
system that determines whether a particxilar character sorts higher, lower, or the same as 



wo 2005/077123 PCT/US2005/004653 

another. Thus, if a Peer_Sequence of node A is ordered before a Peer^Sequence of the node 
B, the transformed value of the Peer_Sequence of node A is ordered before the transformed 
value of the Peer^Sequence for node B. 

[0020] hi a preferred embodiment of the present invention, the generated TREE_SORT 
values are of a STRING data type. As a result, the present invention advantageously uses 
SQL string-type functions, such as SORT (ORDER BY) and LIKE, to efiBcientiy traverse 
across multiple levels of a hierarchical data structure. When a tree data structure is sorted 
according to a TREE_SORT construct, descendant nodes of a particular node have 
TREE^SORT values that sort after the coromon ancestor's TREE_SORT value. This 
observation allows the system to execute efficient queries to, for example, identify ancestors 
or descendants of a given node, or the next node in the hierarchy, and yet to avoid traversing 
across multiple levels of a hierarchical data structure using a pointer-chain traversal. 

BRIEF DESCRIPTION OF THE DRAWINGS 
[0021] FIGURE 1 is an exemplary hierarchical tree data structure. 
[00^] FIGURE 2 is a block diagram of a system for providing efficient indexing of 
hierarchical data in relational databases according to an embodiment of the present invention. 
[0023] FIGURE 3 is a flow diagram of the steps performed by the present invention. 
[0024] The figures depict embodiments of the present invention for purposes of 
illustration only. One skilled in the art will readily recognize from the following description 
that alternative embodiments of the structures and methods illustrated herein may be 
employed without departing fi'om the principles of the invention described herein. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 
1, System Architecture Overview 

[0025] Fig. 2 is a block diagram of system 200 for providing eflBcient indexing of 
hierarchical data in relational databases according to an embodiment of the present invention. 
System 200 includes a data store 220, a sorting construct generation engine 240, and a 
database engine 260. 

[0026] Data store 220 holds database having records, hi one embodiment, when system 
200 is implemented in a healthcare enviromnent, data store 220 maintains medical records. 
Data store 220 can be implemented, for example, as a relational database management system 
(RDMBS), and queries to the data store 220 are accompUshed via SQL. An example of a 
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relatioiial database representation of the tree structure illustrated in Fig. 1 was shown in Table 
1. 

[0027] Sorting construct generation engine 240 is adapted to generate sorting constructs 
TREE SORT using transformed Peer__Sequence values according to an algorithm, which will 
be described in greater detail in the 'Methods of Opemtion" section. Peer_Sequence values 
represent a sequential ordering among siblings of the node having the same parent Briefly, 
the generated TREE_SORT is an ordered construct of a fixed width ordered according to the 
. same system as the transformed Peer__Sequence values. In a preferred embodiment, 
TREE_SORT construct is of a STRING data type. Usmg a TREE_SORT construct allows 
system 200 to efficiently sort and retrieve data in a hierarchical data structure. 
[0028] Database engine 260 is adapted to use generated TREE_SORT constructs to 
execute efficient queries on data store 220. Operations performed by database engine 260 
will be described in more detail below in the "Methods of Operation" section. 
[0029] In one embodiment, engines 220, 240, and 260 are implemented as modules. As 
used herein, the term "module" refers to computer program code ad^ted to provide the 
functionality attributed to the module. The program code is embodied in a random access 
memory (RAM), a read-only memory (ROM) or other media. 
2. Methods of Operation 
Generation of TREE SORT construct 

[0030] As was previously described, engine 240 generates TREE_SORT constructs 
usuig Peer_Sequence values. At this point, it is beneficial to describe the Peer_Sequence 
value and its transformed value in greater detail. 

[0031] As radicated earlier, a Peer_Sequence value represents a sequential ordering of 
siblings of the same node. The Peer_Sequence value can be any value capable of being 
transformed into a fixed-length string having a given width. Peer_Sequence values are 
monotonically increasing among children of a given parent A given instance of engine 240 
will transform Peer_Sequence into a fixed-length value that maintains the same ordering 
among its peers, but is of a sortable and concatenable data type. Transformed Peer_Sequence 
values are capable of being ordered according an ordering system that determines whether a 
particular character sorts higher, lower, or the same ^ another. Such an ordering is referred 
to as "collating sequence". In one embodiment, transformed Peer_Sequence values are 
ordered according to the American Standard Code for Information Interchange (ASCII). In 
another embodiment, both the Peer Sequence and the transformed Peer_Sequence are ASCII. 
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In another embodiment, transformed Peer_Sequence values are ordered according to tte 
Unicode system. As was previously described, the Peer_Sequence value of the first node 
among siblings of the same parent can indicate the default interval by which the 
Peer_Sequence value of a subsequent sibling will be incremented. For example, if the 
Peer Sequence of the first sibling is "10", the Peer_Sequence value for each subsequent 
sibling node is incremented by "10" 

[0032] Referring now to Fig. 3, a flow diagram of a method for generating a 
TREE_SORT construct is described. At this pomt, it is beneficial to describe properties of 
the TElEE_SORT construct hi a preferred embodiment, a TREE_SORT construct is defined 
as a STRINGdata type. The STRING data type can be of a CHARACTER (CHAR) and 
CHARACTER VARYING (VARCHAR) data type. If TREE_SORTisof aCHARdata 
type, a fixed number of characters are allocated for the TREE_SORT field that stores a 
TREE_SORT value. If the length of the TREE_SORT value is shorter than the fixed length 
of the allocated TREE_SORT field, in one implementation, the remaining characters of the 
TREE_SORT value are filled, to the right of the generated value, with characters that sort 
before text characters and digital characters. Jxx one implementation, the remaining characters 
are filled with blank characters. 

[00331 In another implementation, when TREE_SORT is of a VARCHAR data type, a 
maximum length of the TREE^SORT field is specified. The length of the TREE_SORT 
construct (LENGTH) is a product of a given width and the maximum number of levels 
(MAXJLEVEL) m the hierarchy. For example, if the width is 2 and the maximum number of 
levels is 15, the LENGTH of the TREE_SORT construct is 30. Alternatively, in another 
implementation, if the transformed Peer Sequence value of the root node is not included in 
the TREE_SORT to avoid redimdancy, the length of the TREE_SORT construct is a product 
of the (MAX_LEVEL-1) and width. For example, a width of 3 and a length of 120 would 
result in a table that could support up to 41 hierarchical levels. The width indicates the 
number of children that any given node can have. For example, a width of 3 indicates that a 
given node can have up to 999 immediate children (if each transformed Peer_Sequence value 
is expressed only using digit characters); a width of 2 indicates that a given node can have up 
to 99 children. 

[0034] Continuing with tiie description of Fig. 3, at step 3 10, to generate a 

TREE SORT construct, the method traverses the hierarchical data structure, such as the one 

shown in Fig. 1, fix)m the root node. To determine whether the node is the root node, the 
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method determines whether the node has a parent To this end, in one embodiment, the 
method determines \^4iether the Parent OID of the node is hfULL. 

[0035] In one implementation, the method converts 320 the Peer_Sequence value of the 
root node to a lefl-justified substring of characters of a given width. If TREE_SORT field is 
defined as a CHAR data type, and the length of the transformed substring is shorter than the 
specified fixed length of the field, the method adds sufBx padding characters that sort before 
text characters and digital characters when an SQL-like function is applied. 
[0036] The following example will illustrate the implementation of this method. In the 
following example, WIDTH = 2, MAX_LEVEL=5, LENGTH = WIDTH x MAX_LEVEL= 
2x5=10, and INTERVAL = 1 . First, the root node, Alpha, is identified as the node that has 
no parent node. In one implementation, the Parent_OID of the root node- NULL. This node 
is indexed by OID=100. The Peer Sequence value for this node is "1." 
[0037] According to the method, in one implementation, the Peer_Sequence value "1" 
is converted to a left-justified substring of characters of WIDTH=^. In one implementation, 
if the TREE_SORT field is of a CHAR data type, characters that sort before text and digital 
characters are added to the right of the generated substring since the length of the transformed 
value is shorter than the fixed length (10) of the TREE SORT field. In this example, the 
transformed Peer_Sequence string value is "01", and the generated TREE_SORT construct is 
"Olxxxxxxxx", wherein "x" represents a charaicter that sorts before text and digital characters 
and that is being added to the transformed Peer Sequence values to make the length of the 
construct equal to the fixed length of the TREE SORT field. In one implementation, 'V is a 
blank character (space). 

[0038] The process then generates 320 TREE^SORTofthe children ofthe root node. 
To this end, the process identifies nodes having their Parent OBD eqxial to the OID of the root 
node. The process then generates the TREE_SORT for the children nodes in the following 
manner: 

[0039] 1 . The process converts 330 Peer__Sequence values for children nodes to a left- 
justified substring of characters of a given width; 

[0040] 2. The process then concatenates 340 the TREE_SORT of tiie root node with 
the generated substring of the transformed Peer_Sequence value. The process continues 350 
xmtil TREE__SORT values for all nodes in the tree are generated. In one implementation, the 
process uses the PREPEND fimction to add the TREE_SORT of the root node to the left of 
the transformed Peer_Sequence value for the child node. 
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[0041] Continuing with the same example, the process generates a TREE_SORT for all 
children of that node. As shown in Fig. 1, Alpha node has the following children: Bravo, 
Charlie, and Juliet The process converts the Peer Sequence of Bravo to "01", 
Peer_Sequence of Charlie to "02", and Peer_Sequence of Juliet to "03." 
[0042] The process then prepends the TEUBE_SORT value for Alpha (the parent node of 
Bravo, Charlie, and Juliet) to the left of the TREE^SORT for each of these nodes. As a 
result, the generated TREE_SORT for nodes Bravo, Charlie, and Juliet are "0101", "0102", 
and "0103" respectively. 

[0043] Continuing with the same example. Bravo, Charlie, and Julie have their own set 
of children, each with its own set of Peer_Sequence values. For example. Delta is a child of 
Bravo. The Peer Sequence value of Delta is "1". The transformed Peer_Sequence of Delta 
is "01". The TEyEE_SORT of Delta is determined by prepending the TREE_SORT of its 
immediate parent, Bravo, with the transfonned value of "0 1 ". As a resiilt, the generated 
TREE^SORT for Delta is "010101". 

[0044] Table 2 illustrates a relational database representation of the hierarchical data 
structure shown in Fig. 1 that includes a TREE_SORT field. In Table 2, the WIDTH of the 
generated TREE_SORT is 2. Peer_Sequence values start at 1 and have an interval of 1 for 
each subsequent child of a given parent 

Table 2. Relational Database Representation of a Hierarchical Data Structure After Adding 
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[0045] Thus, as shown in Table 2, in a preferred embodiment, a TREE_SORT construct 
of the root node (Alpha) in the hierarchical data structure includes a leading substring that 
represents a transformed Peer_Sequence value. A TREE SORT value of the nodes other 
than the root node includes the following components: a leading substring that represents a 
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TREE SORT value of the parent node, and the transfonned Peer Sequence value for that 
node. Thus, a TREE_SORT construct for a particular node represents the chain of 
Peer_Sequence values starting from the root of the tree structure and going down to the 
specific node. 

[0046] Importantly, the generated TElEE_SORT values are ordered according to the 
same collating sequence as the transformed Peer_Sequence values so that if Peer_Sequence 
value A is ordered before Peer_Sequence value B, the transformed value of the 
Peer Sequence value A sorts before the transformed value of the Peer_Sequence value B. In 
a preferred embodiment of the present invention, TREE_SORT values and Peer_Sequence 
values are ordered according to the collating sequence defined by the ASCII system. 
According to the ASCII collating system, smaller numeric characters are sorted before 
greater numeric characters; text characters are sorted in alphabetical order; and blank symbols 
are sorted before all numeric and text characters. One skilled in the art would understand that 
Peer_Sequence values can be sorted according to any other system in which any subsequent 
value is increasing monotonically. 

[0047] In other implementations, to avoid redundancy, the TREE_SORT value of all 
the nodes omits from the TREE_SORT value the string portion that would have been derived 
from the root node's Peer_Sequence. 

[0048] Since in the preferred embodiment, the generated TREE_SORT values are of a 
STRING data type, the present invention advantageously uses SQL string-type ftinctions, 
such as SORT (ORDER BY) and LIKE, to traverse across multiple levels of a hierarchical 
data structure. As was previously indicated, a TREE_SORT value for a given node, other 
than a root node, includes a leading substring that represents a TREE_SORT value for a 
parent node and the transformed Peer_Sequence value for that node. When a tree data 
structure is sorted according to a TREE_SORT construct, descendant nodes of a particular 
node have a TREE_SORT that sorts after the TREE^SORT value of their common parent 
node. A node is a conmion parent for two child nodes if Parent_OID for these nodes equal to 
OJD of that parent 

[0049] The following example will illxistrate the described concept. Continuing with 
tiie example illustrated in Table 2, TREE_SORT of Bravo is "0101". TREE_SORT of Delta 
(Bravo's child) is "010101." In this example, TREE__SORT values are ordered according to 
the collating sequence of the ASCII system. As was previously noted, TREE_SORT of Delta 
comprises the following components: TREE_SORT of Bravo "0101" and transformed 
Peer^Sequence value of Delta "0 1 ". Similarly, TREE_SORT of JuUet comprises 
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TREE_SORT of Bravo, "0101", and transfonned Peer_Sequence of Juliet, "03". When an^ 
SQL SORT (ORDER BY) query is issued, the TEUEE_^SORT of Alpha ("01") sorts before 
TREE_SORT of Bravo, "0101", since characters that are added to the right of TREE_SORT 
of Alpha are sorted before numeric and text characters. Accordingly, "01" is sorted before 
"0101." 

[0050] Similarly, TREE^SORT of Bravo, "0101", sorts before TREE^SORT of Juliet, 
"010103" because characters added to TREE_SORT of Bravo are sorted before numeric and 
text characters added to "0101," 
Tree Display Ordering 

[0051] As was previously described, often, a database user wants to produce a tree 
display of a hierarchical data structure. Existing solutions for accomplishing this task have 
been left for programmers since producing a tree display of a hierarchical data structure could 
not be accomplished with a single query with the given definition. 

[0052] The present invention has overcome the limitations of the existing solutions by 

issuing a query that sorts elements of a hierarchical data structure according to TREE_SORT 

constructs. The TEUEE_SORT allows the following simple query to produce a tree display of 

a hierarchical data structure: 

Select 

Case 

When LEVEL = 1 Then DATA 

Else concat(Substring(*4-F+++++++',l,LEVEL-l),DATA) 
End, 

TREE^SORT 
From A_TREE 
Order by TEffiE_SORT 

[0053] The above query performed on the example iUustrated in Table 2 will produce 

the following output: 

Alpha 01 

+Bravo 0101 

++Delta 010101 

-fCharlie 0102 
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++Echo 


010201 


++Foxtrot 


010202 


++-K3olf 


01020201 


++++India 


0102020101 


4-H-Hotel 


'01020202' 


+Juliet 


'0103' 



[0054] The sign prefixing the DATA indicates that a given node is a child of the 
preceding node with fewer or no signs. In the display list, descendant nodes of a given 
node have a TREE_SORT that sorts after the TREE_SORT of a common parent Thus, 
descendants of CharUe, such as Golf and Hotel, sort after their conunon parent Foxtrot. 
F.vflTniniTif T Relationships Across Multiple Levels of a Hierarchical Data Structure 

[0055] To find all descendants of a target node in a hierarchical data structure, a simple 
query will be issued that identifies nodes having a TREE_SORT value with a leading 
substring that matches the TBlEE_SORT of the target node. 

[0056] For example, an efficient query to find all the descendants of Charlie (whose 
TREE_SORT = *0102') would find all descendants having a TREE_SORT with a leading 
substring equal to the TREE_SORT value of Charlie. The following query will be issued: 
Select DATA firom A_TEIEE 
where TREE_^SORT like "0102%" 

[0057] This query produces the following output: Echo, Foxtrot, Golf, Hotel, and India. 
Thus, using a TREE_SORT construct to retrieve and sort data m a hierarchical data structure 
is advantageous over prior art solutions that would require performing a pointer-chain 
traversal to find all nodes having a Parent_OID equal to the OID of Charlie or that of 
CharUe's child. Accordingly, a TREE_SORT construct avoids the complexities of navigating 
the tree data structure. 

[0058] To retrieve all ancestors of a target node in a hierarchical data structure, a simple 
query will be issued that identifies nodes having a TREE_SORT value that match a leading 
substring of the target node. 

[0059] To find out if a given node is descendant of a target node, a sunple query will be 
issued to determine whether a leading substring of the TREE_SORT of the given node 
matches the 'nREE_SORT of the target node. 

12 



wo 2005/077123 



PCT/US2005/004653 



[0060] To find out if a given node is an ancestor of a target node, a simple query will be 
issued to determine whether the TREE_SORT value of the given node matches the leading 
substring of the target node. 

[0061] For example, find out if India a descendant of CharUe?" using the generated 
TREE_SORT construct, a database user needs to issue a query to find out if the TREE_SORT 
of Charlie is the leading substring of the TREE_SORT of India. An exemplary SQL query 
fragment is shown below: 

'Svhere substring (India.TREE_^SORT,Ueng<h (CharUe.-EREE.SORT)) = 
Charlie.TREE„SORT" 

[0062] Database engine 260 executes this query as follows. Database engine 260 
parses the TREE__SORT value of India to identify a leading substring that starts with the first 
position and has the length of the CharUe's TREE_SORT construct (four characters). 
TREE_SORTof Indiais '0102020101' and TREE_SORT of Charlie is *0102\ The first four 
characters of TREE_SORT of India are '0102'. Accordingly, Charlie is a direct ancestor of 
India. Thus, using a TREE_SORT construct allows system 200 to efficiently retrieve and 
sort data in a hierarchical data structure thereby avoiding performing convoluted pointer- 
chain traversal queries. 
Alternative Embodiments 

[0063] It should be noted that although Table 2 illustrates database representation of 
one hierarchical data structure, one of skill m the art would understand that multiple 
hierarchies can be stored in one database table. In such an implementation, a MASTER_OID 
field is added to the table to identify nodes of a particular hierarchy within the table so that all 
nodes within a given hierarchy have the same MASTER_OID value. Other implementations 
of the present mvention store in a TREE_SORT field only transformed Peer_Sequence value 
of the ancestors of a particular node but omit the TREE_SORT valiie of the root node to 
avoid redundancy. 

[0064] To query nodes of a given tree structure identified by a unique MASTER_OID 

from the database table that stores multiple hierarchies, database engine 260 issues the 

following query to select records indexed by, for example, MASTER_OII>=100 and ordered 

byTREE_SORT: 

Select * from MANY^TREES 

where MASTER_OID = 100 

order by TREE^SORT 
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[0065] To retrieve all descendants of a target node from a database table that stores 
multiple hierarchies, database engine 260 uses MASTER_OID and the leading substring of 
TREE__SORT of the target to obtain those results. For example, when the target node is 
Charlie, the following query is issued: 
Select * from MANY_TREES 
where MASTER_OrD = 100 
and TREE^SORT like '0102%' 

[0066] In the above query, '0102%' is Charlie*s TREE^SORT value with the the 

SQL string march operator tacked on. Charlie's descendants are nodes having TREE_SORT 

values with a leading substring that matches Charlie's TREE_SORT. In other words, their 

TRRK SORT values start with the TREE^SORT value of CharUe. 

[0067] Thus, a TREE_SORT construct is used to eflBlcientiy query data store 220 that 

holds database records for multiple hierarchical data structures. 

Adding Nodes to the Hierarchical Data Structure 

[0068] According to an embodiment of the present invention, adding new nodes to a 
tree data structure involves creating a new OID, Parent_OID, and Peer_Sequence. To this 
end, it is desirable to set the interval for Peer Sequence such that new sibUng nodes can be 
added. However, if there is not enough room to add a new Peer Sequence value or the 
Interval=l, opening up a place for a new node involves renumbering Peer_Sequence values. 
The scope of this renumbering of Peer_Sequence is limited to the siblings of the new node 
having the Peer_Sequence value greater than that of the new node. The changes to the 
TREE_SORT values involve changes to the TREE_SORT for those sibUngs and their 
descendants. 

XML Database Representation 

[0069] One application of the TREE_SORT construct is for traversing XML 
documents. An XML document includes a plurality of elements that are organized in a . 
hierarchical order. An element in the XML document may be associated with a value. A 
root element is the topmost element in the document that has no ancestor elements. An XML 
document also has leaf elements and internal elements. An internal element is an element 
that has one or more child elements. A leaf element has no children. All children of the 
same element are siblings. 
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[0070] An exemplary XML document is shown bellow. 
<Alpha lJRL=*Titlp://www.nowhere.nohow.fake/example.x^ 
<Bravo> 

<Delta>HeUo There</Delta> 

</Bravo> 

<Charlie> 

<Echo>Agam<yEcho> 
<Foxtro1> 
<Golf> 

<India>500.5<;/India> 
<yGolf> 

<Hotel>No Tell</Hotel> 
</Foxtrot> 
<;/Charlie> 

<JulietX/Juliet> ' 
</Alpha> 



[0071] In the XML docimient shown above. Alpha, Bravo, Delta, etc. are elements. 
Alpha is a root element in the hierarchy; "Bravo" and "Charlie" are children of Alpha. Thus, 
Bravo and Charlie are siblings. In the XML document. Delta has a value of *TH[ello There", 
and Echo has the value "Again." 

[0072] Referring again to Fig. 1 , elements of the XML document shown above are 
illustrated in the hierarchical data structure shown in Fig. 1 . Each element is identiJBed by its 
ODD, the Parent_OID, Level, and the Peer_Sequence value. 

[0073] The above XML document can be represented as records in a relational 
database. The representation of the XML document in the relational database is shown 
below in Table 3. 
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Table 3: Representation of XML Document in Relational Database 



on) 


Peer 


LEVEL 


ELEMENT 


ELEMENT 




Sequence 




NAME 


VALUE 


100 


1 


1 


Alpha 




200 


1 


2 


Bravo 




300 


2 


2 


Charlie 




400 


1 


3 


Delta 


Hello Hiere 


600 


1 


3 


Echo 


Again 


500 


2 


3 


Foxtrot 




700 


1 


4 


Golf 




800 


2 


4 


Hotel 


NoTeU 


900 


1 


5 


India 


500.5 


1000 


3 


2 


Juliet 





[0074] As shown in Table 3, each record has a plurality of fields. The DATA field has 
been expanded into ELEMENT_NAME, ELEMENT_VALUE, ATTRIBUTE_NAME (not 
shown), ATTRIBUTE_SEQUENCE (not shown) and ATTRIBUTE_VALUE (not shown). 
Other fields in the Table are the OID, the Parent_OID, and Level. 
[0075] According to an embodiment of the present invention, sorting construct 
generation engine 240 uses the Peer_Sequence value for each element in the XML document 
to generate TREE_SORT constructs according to an algorithm described earlier. Table 4 
below illustrates representation of the XML docimient in the database table with generated 
TREE_SORT values. 

Table 4. Representation of XML document in Relational Database with TREE_SORT 



construct 



on) 


Peer 


LEVEL 


TREE SORT 


ELEMENT 


ELEMENT 


Sequence 






NAME 


VALUE 


100 




1 


'or 


Alpha 




200 




2 


'oior 


Bravo 




300 




2 


•0102' 


Charlie 




400 




3 


•oioior 


Delta 


Hello There 


600 




3 


'010201' 


Echo 


Again 


500 




3 


'010202' 


Foxtrot 




700 




4 


*0102020r 


Golf . 




800 




4 


'01020202' 


Hotel 


No Tell 


900 




5 


'0102020101' 


India 


500.5 


1000 




2 


'0103' 


Juliet 





[0076] Although in a preferred embodiment of the present invention, systeni 200 
generates a TREE_SORT construct of a string data type to benefit firom string-type fimctions 
of the SQL to efiBcientiy query hierarchical data structures represented in relational 
databases, a person skilled in the art would understand that a TREE_SORT construct can be 
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of any data type to benefit fix)m the capabilities of SQL sort-type functions to efficiently 
query hierarchical data structures. 

[0077] A person skilled in the art woiild understand that the present invention could be 
utilized with respect to any type of data Notably, the present invention can be used in 
healthcare information systems. For example, the present invention can be utilized in the 
following aspects of the management of healthcare information systems: 

• Evaluation and management coding. A TREE_SORT construct is used to perform 
evaluation and management aggregation and summary of obtained findings for a 
patient into the complex counts and categorizations within a single query. 

• Chartable Item Templates. A TREE_SORT construct can be used to efficientiy 
portray and traverse hierarchical data structures and effectively use internal and 
external sub-trees; to portray expandable and collapsible lines in a flowsheet and to 
guide the interaction and charting fix)m mini-forms for charting. 

• Aggregates. Using A TREE_SORT construct yields an effective and easy traversal of 
aggregates in healthcare systems. 

• Clinical Notes/Documents. A TREE SORT constnixct is used for compact and 
efficient representation of composed phrases/sentences/paragraphs/sections since 
clinical documentation can be represented as hierarchical data structures. 

• Hierarchical Findings. A TREE_SORT construct is used to roll-up and perform 
dependency determination. The TREE^SORT construct allows for efficient use of 
hierarchical clinical context mini-trees to represent interconnected observations, and 
provides a bridge between post-coordination and the atomic medical event 

• Hierarchical selection lists. 

• A TREE SORT construct is used in general nomenclature subsets. 

[0078] The present invention has been described in particular detail with respect to 
several possible embodiments. Those of skill in the art will appreciate that the invention may 
be practiced in other embodiments. First, the particular naming of the components, 
capitalization of terms, the attributes, data structures, or any other programming or structural 
aspect is not mandatory or significant, and the mechanisms that implement the invention or 
its features may have different names, formats, or protocols. Further, the system may be 
implemented via a combination of hardware and software, as described, or entirely in 
hardware elements. Also, the particular division of functionality between the various system 
components described herein is merely exemplary, and not mandatory; fimctions performed 



17 



wo 2005/077123 



PCT/US2005/004653 



by a single system component may instead be performed by multiple components, and 
functions performed by multiple components may instead performed by a single component 
[0079] Some portions of above description present the features of the present invention 
in terms of algorithms and symbolic representations of operations on information. These 
algorithmic descriptions and representations are the means used by those skilled in the data 
processing arts to most effectively convey the substance of their work to others skilled in the 
art These operations, while described functionally or logically, are understood to be 
implemented by computer programs. Furthermore, it has also proven convenient at times, to 
refer to these arrangements of operations as modules or by functional names, without loss of 
generality. 

[0080] Unless specifically stated otherwise as apparent from the above discussion, it is 
appreciated that throughout flie description, discxjssions utilizing terms such as '^processing" 
or "computing" or "calculating" or "determining" or "displaying" or the like, refer to the 
action and processes of a computer system, or similar electronic computing device, that 
manipulates and transforms data represented as physical (electronic) quantities within the 
computer system memories or registers or other such information storage, transmission or 
display devices. 

[0081] Certain aspects of the present invention include process steps and instructions 
described herein in the form of an algorithm. It should be noted that the process steps and 
instructions of the present invention could be embodied in sofhvare, firmware or hardware, 
and when embodied in software, could be downloaded to reside on and be operated from 
difierent platforms used by real time network operating systems. 

[0082] The present invention also relates to an apparatus for performing the operations 
herein. This apparatus may be specially constructed for the required purposes, or it may 
comprise a general-purpose computer selectively activated or reconfigured by a computer 
program stored on a computer.readable medium that can be accessed by the computer. Such 
a computer prognon may be stored in a computer readable storage medium, such as, but is 
not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, magnetic- 
optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, 
EEPROMs, magnetic or optical cards, application specific integrated circuits (ASICs), or any 
type of media suitable for storing electronic instructions, and each coupled to a computer 
system bus. Fiirthermore, the computers referred to in the specification may include a single 
processor or may be architectures employing multiple processor designs for increased 
computing capability. 
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[0083] The algorithms and operations presented herein are not inherently related to any 
particular computer or other apparatus. Various general-purpose systems may also be used 
with programs in accordance with the teachings herein, or it may prove convenient to 
construct more specialized apparatus to perform the required method steps. The required 
structure for a variety of these systems will be apparent to those of skill in the, along with 
equivalent variations. In addition, the present invention is not described with reference to any 
particiilar programming language. It is appreciated that a variety of programming languages 
may be used to implement the teachings of the present invention as described herein, and any 
references to specific languages are provided for disclosure of enablement and best mode of 
the present invention. 

[0084] The present invention is well suited to a wide variety of computer network 
systems over numerous topologies. Within tiiis field, the configuration and management of 
large networks comprise storage devices and computers that are communicatively coupled to 
dissimilar computers and storage devices over a network, such as the Internet 
Finally, it should be noted that the language used in the specification has been principally 
selected for readability and instructional purposes, and may not have been selected to 
delineate or circumscribe the inventive subject matter. Accordingly, the disclosure of the 
present invention is intended to be illustrative, but not limiting, of the scope of the invention, 
which is set forth in the following claims. 
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I CLAM: 



1. A computer-implemented method for efficiently indexing a hierarchical data 
structure stored in a database, the method comprising: 

creating an instance of a database in which elements of a hierarchical data 

structure are in an ancestor-descendant relationship and an element is 

indexed by a peer sequence value that identifies sequential ordering of 

elements having a same parent; 
transforming a peer sequence value for a parent element into a sorting construct of 

a given width to generate a sorting construct of the parent element; 
transforming a peer sequence value of a child element into a sorting construct of 

the given width; and 
prepending the sorting construct of the parent element to the left of the 

transformed peer sequence value of the child element to generate a sorting 

construct of the child element. 

2. The method of claim 1, wherein peer sequence values and sorting constructs are 
ordered according to the same collating sequence. 

3. The method of claim 1, wherein peer sequence values are ordered according to the 
ASCn system. 

4. The method of claim 1, \;s^erein peer sequence values are ordered according to the 
Unicode system. 
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5. The method of claim 1, wherein the sorting construct is of a string data type. 

6. The method of claim 1, further comprising adding characters to the sorting 
construct of a parent element, the added characters are sorted before text characters and 
digital characters. 

7. The method of claim 1, wherein multiple hierarchical data structures are stored in 
the database. 

8. The method of claim 1, wherein the hierarchical data structure describes an XML 
document. 

9. The method of claim 1, wherein the length of the sorting construct is a function of 
a given width of the sorting construct and a maximum number of levels in the hierarchical 
data structure. 

10. The method of claim 1, wherein peer sequence values are monotonically 
increasing among children of the same parent element 

1 1 . A system for efBciently indexing a hierarchical data structure in a database, the 
system comprises: 

the database for storing a hierarchical data structure, wherein elements in the 

hierarchical data structure are in a parent-child relationship, wherein an 
element is indexed by a peer sequence value that identifies sequential 
ordering of elements having a same parent element; 
a sorting construct generation module adapted to perform the steps of: 
transforming a peer sequence value for a parent element into a sorting 

construct of a given width to generate a sorting tree construct of the 
parent element, 

transforming a peer sequence value of a child element into a sorting construct 

of the given width, and 
prepending the sorting tree construct of the parent element to the transformed 

peer sequence value of the child element to generate a sorting tree 

construct of the child element 
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12. The system of claim 11, further comprising a database engine module adapted to 
efiBciendy execute queries to the database using the generated sorting construct 

13. The system of claim 11, wterein the database comprises a plurality of tables, and 
wherein a table in the database is adapted to store multiple hierarchical data structures. 

14. The system of claim 11, wherein peer sequence valxies and sorting constructs are 
ordered according to the same collating sequence. 

15. The system of claim 1 1, \^dierein the sorting construct generation module is 
further adapted to add characters to the sorting construct of a parent element, the added 
characters are sorted before text characters and digital characters. 

16. The system of claim 1 1, \?^dierein the sorting construct has a length, and the length 
of the sorting construct is a function of the given width of the sorting construct and a 
maximum number of levels in the hierarchical data structure, 

1 7. A method for efficiently identifying a descendant of an element in a hierarchical 
data structure, the mefliod comprising: 

creating an instance of a database in which elements of the hierarchical data 

structure are in an ancestor-descendant relationship and each element is 
indexed by a peer sequence value that identifies sequential ordering of 
elements having the same parent; 

generating a sorting construct for an element by transforming a peer sequence 
value for the element and prepending a leading substring, the leading 
substring is representative of a sorting construct of an ancestor of the 
element; and 

identifying elements in the data structure having a leading substring equal to the 
sorting construct of the element 

1 8. A method for efficientiy identifying an ancestor of an element in a hierarchical 
data structure, the method comprising: 

creating an instance of a database in which elements of the hierarchical data 

structure are in an ancestor-descendant relationship and each element is 
indexed by a peer sequence value that identifies sequential ordering of 
elements having the same parent; 

generating a sorting construct for an element by transforming a peer sequence 
value for the element and prepending a leading substring, the leading 
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substring is representative of a sorting construct of an ancestor of the 
element; and 

identifying elements in the data structure witih the sorting construct equal to the 

leading substring of the element 
19. A computer program product comprising: 

a computer-readable mediimi having computer program code embodied therein 

for efficiently indexing a hierarchical data structure in a database, the 

computer program code adapted to: 
create an instance of a database in which elements of a hierarchical data structure 

are in an ancestor-descendant relationship and an element is indexed by a 

peer sequence value that identifies sequential ordering of elements having 

a same parent; 

transform a peer sequence value for a parent element into a sorting construct of a 
given width to generate a sorting construct of the parent element 

transform a peer sequence value of a child element iuto a sorting construct of the 
given width; and 

prepend the sorting construct of the parent element to the left of the transformed 
peer sequence value of the child element to generate a sorting construct of 
the child element 
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